4 research outputs found

    Demystifying the Characteristics of 3D-Stacked Memories: A Case Study for Hybrid Memory Cube

    Three-dimensional (3D)-stacking technology, which enables the integration of DRAM and logic dies, offers high bandwidth and low energy consumption. This technology also empowers new memory designs for executing tasks not traditionally associated with memories. A practical 3D-stacked memory is the Hybrid Memory Cube (HMC), which provides significant access bandwidth and low power consumption in a small area. Although several studies have taken advantage of the novel architecture of HMC, its characteristics in terms of latency and bandwidth, and their correlation with temperature and power consumption, have not been fully explored. This paper is the first, to the best of our knowledge, to characterize the thermal behavior of HMC in a real environment using the AC-510 accelerator and to identify temperature as a new limitation for this state-of-the-art design space. Moreover, besides bandwidth studies, we deconstruct the factors that contribute to latency and reveal their sources for high- and low-load accesses. The results of this paper demonstrate essential behaviors and performance bottlenecks for future explorations of packet-switched and 3D-stacked memories.
    Comment: IEEE Catalog Number: CFP17236-USB ISBN 13: 978-1-5386-1232-

    Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube

    Memories that exploit three-dimensional (3D)-stacking technology, which integrates memory and logic dies in a single stack, are becoming popular. These memories, such as Hybrid Memory Cube (HMC), utilize a network-on-chip (NoC) design for connecting their internal structural organizations. This novel usage of NoC, in addition to aiding processing-in-memory capabilities, enables numerous benefits such as high bandwidth and memory-level parallelism. However, the implications of NoCs on the characteristics of 3D-stacked memories in terms of memory access latency and bandwidth have not been fully explored. This paper addresses this knowledge gap by (i) characterizing an HMC prototype on the AC-510 accelerator board and revealing its access latency behaviors, and (ii) investigating the implications of such behaviors on system and software designs.

    A feedback methodology for task-driven fine-grained pixel control in smart cameras

    Camera systems of today capture signals at the highest quality to produce a faithful approximation of what they observe. However, the available bandwidth limits the amount of data the camera can transmit. Advances in camera technologies will only compound the problem further, as the advent of digital pixel technologies and 3D integration promises unprecedented gains in the resolution and frame rates of cameras. As cameras are increasingly used to drive many mission-critical autonomous applications, ranging from traffic monitoring to disaster recovery to defense, a uni-directional processing pipeline misses the opportunity to create a 'true' smart camera. In such applications, 'useful information' depends on the task and is defined using complex features, rather than only changes in the captured signal. In this research, we tackle this problem by proposing a smart imager that applies high-level task-driven feedback at the input space. Specifically, this camera system captures only the useful information pertaining to an end-user-defined task, and does so at the highest quality. The feedback system applies the task feedback at the encoder and sensor layers. The proposed camera enhances the performance of the task while being bandwidth efficient. Uncertainty information is incorporated into the feedback path to improve performance in challenging scenarios with many false positives and false negatives. Lastly, we address visual challenges that impact the feedback control, such as small object detection and moving camera action detection.
    Ph.D.

    Design and implementation of a content aware image processing module on FPGA

    In this thesis, we tackle the problem of designing and implementing a wireless video sensor network for a surveillance application. The goal was to design a low-power, content-aware system that is able to take an image from an image sensor, determine the blocks in the image that contain important information, and encode those blocks for transmission, thus reducing the overall transmission effort. At the same time, the encoder and the preprocessor must not consume so much computation power that the utility of the system is lost. We have implemented such a system, which uses a combination of edge detection and frame differencing to determine useful information within an image. A JPEG encoder then encodes the important blocks for transmission. An implementation on an FPGA is presented in this work. This work demonstrates that preprocessing gives a 48.6% reduction in power for a single frame while maintaining a delivery ratio above 85% for the given set of test frames.
    M.S.